Abstract


The ratio of the values of optimal integer and fractional solutions
to a set covering problem was shown by Johnson [5] and Lovász [6] to be
bounded by $B(d) = 1 + \ln d$, where d is the largest column sum. We show
that if n is the number of variables,
$B(n) = \frac{1}{n} \lfloor \frac{n+1}{2} \rfloor \lceil \frac{n+1}{2} \rceil$
is a best possible bound on this ratio. Furthermore, for every n > 20 there
are problems for which
which  B(n)  <  j^B(d). 


OPTIMAL  INTEGER  AND  FRACTIONAL 
COVERS:  A  SHARP  BOUND  ON  THEIR  RATIO 

by 

Egon  Balas 

The simple (unweighted) set covering problem is

(C)    $z_C = \min\{e_n x \mid Ax \ge e_m,\ x \text{ binary}\}$,

where A is an m × n 0-1 matrix and, for k = m, n, $e_k$ is the k-vector whose
components are all equal to 1, while x is an n-vector of variables.

If the 0-1 condition on the variables is relaxed to nonnegativity, we
obtain the continuous or fractional set covering problem

(F)    $z_F = \min\{e_n x \mid Ax \ge e_m,\ x \ge 0\}$.

A vector x that satisfies the constraints of (C) (of (F)) will be called
a cover (fractional cover).
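
As a small illustration of the gap between (C) and (F), consider a 3 × 3 instance chosen here for convenience (it is not taken from the text). No single column covers all three rows, so any cover needs two columns; adding the three constraints of (F) gives $2(x_1 + x_2 + x_3) \ge 3$, so no fractional cover has value below 3/2.

```latex
A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix},
\qquad
z_C = 2 \ \text{ (e.g. } x = (1, 1, 0)\text{)},
\qquad
z_F = \tfrac{3}{2} \ \text{ (attained by } x = (\tfrac12, \tfrac12, \tfrac12)\text{)}.
```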

The  set  covering  problem  is  known  to  be  NP-complete.  One  of  the  best 
known  procedures  for  finding  a  cover  that  approximates  the  optimum  is  the  greedy 
heuristic,  which  consists  of  a  sequence  of  steps,  each  of  which  assigns  the 
value  1  to  a  variable  whose  column  covers  a  maximal  number  of  additional  rows. 
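
The greedy step described above can be sketched as follows. This is a minimal illustration, not the implementation analyzed in [5] or [6]: the row representation (each row as the set of columns that cover it), the function name `greedy_cover`, and the tie-breaking rule (lowest column index) are assumptions made for the example.

```python
def greedy_cover(rows, n):
    """Greedy heuristic for the unweighted set covering problem.

    rows: list of sets; rows[i] holds the column indices j with a_ij = 1.
    n:    number of columns.
    Returns the list of chosen columns, in the order they were picked.
    """
    uncovered = set(range(len(rows)))
    chosen = []
    while uncovered:
        # gain[j] = number of still-uncovered rows that column j would cover
        gain = [0] * n
        for i in uncovered:
            for j in rows[i]:
                gain[j] += 1
        best = max(range(n), key=lambda j: gain[j])  # ties: lowest index
        if gain[best] == 0:
            raise ValueError("a row has no 1 entry, so no cover exists")
        chosen.append(best)
        uncovered = {i for i in uncovered if best not in rows[i]}
    return chosen


# Illustrative 3 x 3 instance (hypothetical data).
rows = [{0, 1}, {1, 2}, {0, 2}]
cover = greedy_cover(rows, 3)
print(cover, len(cover))
```

The counts are recomputed from scratch at every step for clarity; an incremental update would be faster but longer.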

The worst case behavior of the greedy heuristic for the (unweighted) set covering
problem was shown by Johnson [5] and Lovász [6] to be given by the relation

(1)    $z_G / z_F \le H(d)$,

where $z_G$ is the value of a cover obtained by the greedy heuristic,

$d = \max_{j \in \{1,\dots,n\}} \sum_{i=1}^{m} a_{ij}$,

and

$H(d) = \sum_{j=1}^{d} \frac{1}{j}$.

Thus the ratio between the value of a "greedy" cover and that of an optimal
fractional cover increases at most with the logarithm of the largest column sum.
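
To make the logarithmic growth concrete, the sketch below evaluates H(d) exactly and checks the standard estimate ln d < H(d) ≤ 1 + ln d for a few values of d. The function name `harmonic` and the sample values are choices made for this illustration.

```python
import math
from fractions import Fraction


def harmonic(d):
    """H(d) = 1 + 1/2 + ... + 1/d, computed exactly."""
    return sum(Fraction(1, j) for j in range(1, d + 1))


for d in (1, 2, 10, 100, 1000):
    h = float(harmonic(d))
    # H(d) is squeezed between ln d and 1 + ln d
    assert math.log(d) < h <= 1 + math.log(d)
    print(d, round(h, 4), round(1 + math.log(d), 4))
```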

Chvátal [2] has shown that the worst case bound given by (1) is also
valid for the greedy heuristic when applied to the weighted set covering problem
with arbitrary but positive cost coefficients $c_j$, j = 1,...,n. If $k_{jt}$
represents the number of new rows covered by column j at step t, the greedy
heuristic for the weighted set covering problem assigns the value 1 at step t to
a variable $x_j$ whose choice maximizes $k_{jt}/c_j$. Furthermore, Ho [3] has
shown that the bound given by (1) is best possible for any (weighted) set
covering heuristic that assigns the value 1 at step t to a variable $x_j$ whose
choice maximizes some arbitrary function $f(c_j, k_{jt})$.
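
The weighted rule can be sketched in the same style. The `score` parameter below stands in for the function $f(c_j, k_{jt})$; its name, its default $k_{jt}/c_j$, the restriction to columns that cover at least one new row, and the sample data are all assumptions of this illustration rather than details from [2] or [3].

```python
def weighted_greedy_cover(rows, costs, score=lambda c, k: k / c):
    """Greedy heuristic driven by a score f(c_j, k_jt).

    rows:  list of sets; rows[i] holds the columns j with a_ij = 1.
    costs: positive cost c_j for each column.
    score: function of (c_j, k_jt); the default k_jt / c_j is the
           weighted greedy rule.
    """
    n = len(costs)
    uncovered = set(range(len(rows)))
    chosen = []
    while uncovered:
        # k[j] = number of new rows column j would cover at this step
        k = [0] * n
        for i in uncovered:
            for j in rows[i]:
                k[j] += 1
        candidates = [j for j in range(n) if k[j] > 0]
        if not candidates:
            raise ValueError("a row has no 1 entry, so no cover exists")
        best = max(candidates, key=lambda j: score(costs[j], k[j]))
        chosen.append(best)
        uncovered = {i for i in uncovered if best not in rows[i]}
    return chosen


# Hypothetical instance: column 2 covers both rows but costs 3.
picked = weighted_greedy_cover([{0, 2}, {1, 2}], [1.0, 1.0, 3.0])
print(picked)
```

Passing a different `score` gives other members of the class of heuristics covered by Ho's result.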

Another  class  of  heuristics,  which  uses  information  (reduced  costs) 
obtained  from  a  (not  necessarily  optimal)  solution  to  the  dual  linear  program, 
has  consistently  outperformed  in  empirical  tests  the  greedy  heuristic  and  its 
above  mentioned  generalizations  (see  Balas  and  Ho  [1]),  but  no  worst  case 
bound  better  than  (or  comparable  to)  (1)  is  known  for  it  (see  Hochbaum  [4]  for 
a  discussion  of  bounds  for  this  heuristic). 

Since $z_G \ge z_C \ge z_F$, the relation (1) implies of course both

(2)    $z_G / z_C \le H(d)$

and

(3)    $z_C / z_F \le H(d)$.

However, while H(d) is a best possible bound for both $z_G/z_F$ and $z_G/z_C$,
it was until recently an open question whether it is also a best possible bound
for $z_C/z_F$, since no better bound than H(d) was known for this latter ratio.

In this paper we give a best possible bound on the value of $z_C/z_F$ for
unweighted set covering problems, as a function of the number n of columns, for
an arbitrary number of rows. For every value of n, there are problems for which
this bound has a value of approximately H(d).

For an arbitrary 0-1 matrix A, we will denote by $z_C(A)$ and $z_F(A)$ the
value of an optimal solution to the (unweighted) set covering problem defined
by A, and to the fractional set covering problem defined by A, respectively.

Let $\mathcal{A}_n$ denote the class of 0-1 matrices with at most n columns, and let

$\mathcal{A}_n(p) = \{A \in \mathcal{A}_n \mid z_C(A) = p\}$.


Theorem 1. For any positive integer n and any p ∈ {1,...,n},

(4)    $\min_{A \in \mathcal{A}_n(p)} z_F(A) = \frac{n}{n-p+1}$,

and the minimum in (4) is attained for the $\binom{n}{p-1} \times n$ matrix A* whose
rows are all the distinct 0-1 n-vectors with exactly n-p+1 components equal to 1.

Proof. We first show that $A^* \in \mathcal{A}_n(p)$. A* has n columns by assumption.
Any binary n-vector x having at least p components equal to 1 satisfies $A^* x \ge e_q$,
where $q = \binom{n}{p-1}$, since no row of A* has more than p-1 entries equal to 0.
Further, every binary n-vector x with at most p-1 components equal to 1 violates the
inequality corresponding to that particular row of A* whose p-1 entries equal
to 0 include those positions where $x_j = 1$. Thus $z_C(A^*) = p$, i.e.,
$A^* \in \mathcal{A}_n(p)$.

Next we show that $z_F(A^*) = n/(n-p+1)$. Let k = n-p+1, and let $\bar{x}$ be
defined by $\bar{x}_j = 1/k$, j = 1,...,n. Let B be any n × n nonsingular submatrix
of A*, such that every column of B has exactly k entries equal to 1. The
definition of A* guarantees the existence of B. Now let $\bar{u}$ be the q-vector
defined by $\bar{u}_i = 1/k$ if the i-th row of A* is a row of B, $\bar{u}_i = 0$
otherwise. Then $\bar{x}$ and $\bar{u}$ are feasible solutions to the linear program
$\min\{e_n x \mid A^* x \ge e_q,\ x \ge 0\}$ and its dual, respectively, with value
$e_n \bar{x} = e_q \bar{u} = n/k$. Hence $\bar{x}$ is an optimal fractional cover,
and $z_F(A^*) = n/k = n/(n-p+1)$.

Finally, we show that A* minimizes $z_F(A)$ over $\mathcal{A}_n(p)$. Assume this to be
false, and let A° be a matrix that minimizes $z_F(A)$ over $\mathcal{A}_n(p)$, with
$z_F(A°) < z_F(A^*)$. Also, let $A^* = (a^*_{ij})$, $A° = (a°_{ij})$. W.l.o.g., we may
assume that A° has n columns, since adding columns whose entries are all equal to 0
does not change either the integer or the fractional optimum. For every
S ⊂ {1,...,n} such that |S| = p-1, A° has a row i such that $a°_{ij} = 0$ for all
j ∈ S; or else x defined by $x_j = 1$, j ∈ S, $x_j = 0$, j ∉ S, would be a cover
with value p-1, contrary to the assumption that $A° \in \mathcal{A}_n(p)$. Hence for
every row i of A*, A° has a row k such that $a°_{kj} \le a^*_{ij}$, j = 1,...,n.
But then $x \ge 0$, $A° x \ge e_r$ implies $A^* x \ge e_q$ (where r is the number
of rows of A°), hence $z_F(A^*) \le z_F(A°)$, a contradiction. ||

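
The first two steps of the proof can be checked numerically for small n and p. The sketch below is an illustration under stated assumptions. It builds A* as a list of index sets, finds $z_C(A^*)$ by brute force, and verifies that x = (1/k, ..., 1/k) is primal feasible while a dual vector supported on the n cyclic "window" rows is dual feasible with the same value n/k. The cyclic rows stand in for the rows of B; nonsingularity is not checked, since weak duality only needs feasibility. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def a_star(n, p):
    """Rows of A*: every set of exactly k = n - p + 1 column indices."""
    return [frozenset(s) for s in combinations(range(n), n - p + 1)]


def integer_optimum(rows, n):
    """z_C by brute force: the size of a smallest set of columns hitting every row."""
    for size in range(n + 1):
        for cols in combinations(range(n), size):
            chosen = set(cols)
            if all(row & chosen for row in rows):
                return size
    raise ValueError("no cover exists")


def fractional_certificate(n, p):
    """Check primal and dual feasibility; return both objective values."""
    k = n - p + 1
    rows = a_star(n, p)
    x = [Fraction(1, k)] * n
    primal_ok = all(sum(x[j] for j in row) >= 1 for row in rows)
    # Dual weights: 1/k on each of the n cyclic windows of k consecutive columns.
    u = {}
    for s in range(n):
        window = frozenset((s + t) % n for t in range(k))
        u[window] = u.get(window, Fraction(0)) + Fraction(1, k)
    dual_ok = all(
        sum(w for row, w in u.items() if j in row) <= 1 for j in range(n)
    )
    return primal_ok, dual_ok, sum(x), sum(u.values())


n, p = 5, 3
print(integer_optimum(a_star(n, p), n))   # compare with p
print(fractional_certificate(n, p))       # both values should equal n/(n-p+1)
```

Because the primal and dual values coincide, weak duality certifies that both are optimal for this instance.
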
Theorem 2. For any $A \in \mathcal{A}_n$,

(5)    $\frac{z_C(A)}{z_F(A)} \le \frac{1}{n} \lfloor \frac{n+1}{2} \rfloor \lceil \frac{n+1}{2} \rceil$,

and this is a best possible bound.


Proof. For fixed p ∈ {1,...,n}, from Theorem 1

(6)    $\max_{A \in \mathcal{A}_n(p)} \frac{z_C(A)}{z_F(A)} = \frac{p}{n}(n-p+1)$.

If p is allowed to vary continuously in the interval [1, n], the right
hand side of (6) is concave and attains its maximum for p = (n+1)/2. Since p
has to be integer, the maximum is attained either for $p = \lfloor \frac{n+1}{2} \rfloor$,
or for $p = \lceil \frac{n+1}{2} \rceil$; namely,

$\max_{A \in \mathcal{A}_n} \frac{z_C(A)}{z_F(A)} = \max\left\{ \frac{1}{n} \lfloor \frac{n+1}{2} \rfloor \left( n - \lfloor \frac{n+1}{2} \rfloor + 1 \right),\ \frac{1}{n} \lceil \frac{n+1}{2} \rceil \left( n - \lceil \frac{n+1}{2} \rceil + 1 \right) \right\} = \frac{1}{n} \lfloor \frac{n+1}{2} \rfloor \lceil \frac{n+1}{2} \rceil$.

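Another expression for the above bound is given by

(7)    $\frac{1}{n} \lfloor \frac{n+1}{2} \rfloor \lceil \frac{n+1}{2} \rceil = \begin{cases} \frac{n}{4} + \frac{1}{2} & \text{if n is even} \\ \frac{n}{4} + \frac{1}{2} + \frac{1}{4n} & \text{if n is odd.} \end{cases}$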
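
The three expressions for the bound can be cross-checked with exact arithmetic: the floor/ceiling product, the maximum of (p/n)(n-p+1) over integer p, and the even/odd closed form in (7). The function names below are chosen for this illustration.

```python
from fractions import Fraction


def ratio_bound(n):
    """(1/n) * floor((n+1)/2) * ceil((n+1)/2)."""
    lo = (n + 1) // 2
    hi = (n + 2) // 2  # equals ceil((n+1)/2) for integer n
    return Fraction(lo * hi, n)


def best_over_p(n):
    """Maximum of (p/n)(n-p+1) over p = 1, ..., n."""
    return max(Fraction(p * (n - p + 1), n) for p in range(1, n + 1))


def closed_form(n):
    """The even/odd expression (7)."""
    base = Fraction(n, 4) + Fraction(1, 2)
    return base if n % 2 == 0 else base + Fraction(1, 4 * n)


for n in range(1, 200):
    assert ratio_bound(n) == best_over_p(n) == closed_form(n)

print(ratio_bound(3), ratio_bound(20))
```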


Thus, the n-variable set covering problem for which the ratio $z_C(A)/z_F(A)$
attains its maximum is the one whose coefficient matrix has exactly
$\lfloor \frac{n+1}{2} \rfloor$ 1's in every row, and contains as a row every binary
n-vector with $\lfloor \frac{n+1}{2} \rfloor$ components equal to 1. For this problem,
$z_C(A) = \lceil \frac{n+1}{2} \rceil$ and $z_F(A) = \frac{2n}{n+\delta}$, where δ = 0
if n is even and δ = 1 if n is odd.
Before concluding our paper, we compare the bound on $z_C(A)/z_F(A)$ given
in Theorem 2 with the bound on $z_G(A)/z_F(A)$ given by (1). To do this, we note
that when we consider the bound H(d) given by (1) for all set covering problems
defined by matrices $A \in \mathcal{A}_n$, the largest d that can occur (provided A has no
componentwise equal rows) happens to occur for the matrix A* having as rows
all possible 0-1 n-vectors with exactly $\lfloor \frac{n+1}{2} \rfloor$ components equal
to 1. For this matrix, we denote d(A*) = d*, and we have

$d^* = \binom{n-1}{\lfloor \frac{n+1}{2} \rfloor - 1}$.


We want to assess the value of the ratio

(8)    $R = \frac{1 + \ln d^*}{\frac{1}{n} \lfloor \frac{n+1}{2} \rfloor \lceil \frac{n+1}{2} \rceil}$.
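
The ratio in (8) can also be evaluated directly. The sketch below assumes that d* is the column sum of the matrix whose rows are all 0-1 n-vectors with exactly ⌊(n+1)/2⌋ ones, that is, the binomial coefficient C(n-1, ⌊(n+1)/2⌋ - 1). The function name `ratio_R` and the sample values of n are choices made for this illustration. It needs Python 3.8 or later for `math.comb`.

```python
import math


def ratio_R(n):
    """R = (1 + ln d*) / ((1/n) * floor((n+1)/2) * ceil((n+1)/2))."""
    k = (n + 1) // 2                       # ones per row of the matrix
    d_star = math.comb(n - 1, k - 1)       # assumed column sum d*
    bound_n = k * ((n + 2) // 2) / n       # the bound of Theorem 2
    return (1 + math.log(d_star)) / bound_n


for n in (10, 100, 1000, 10000):
    print(n, round(ratio_R(n), 4))
print("4 ln 2 =", round(4 * math.log(2), 4))
```

`math.log` accepts arbitrarily large integers, so the binomial coefficient does not need to be converted to a float first.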


Theorem 3. For n > 2,

(9)

Proof.  From  (8),  we  have 


Using Stirling's formula as refined by Robbins,

$(n/e)^n (2\pi n)^{1/2} e^{1/(12n+1)} < n! < (n/e)^n (2\pi n)^{1/2} e^{1/(12n)}$,

we have

Using $\lfloor \frac{n-1}{2} \rfloor + \lceil \frac{n-1}{2} \rceil = n-1$ and
$\ln \lfloor \frac{n-1}{2} \rfloor \le \ln \lceil \frac{n-1}{2} \rceil$,


we  obtain 


(11)


.  (n+1)  (n-1)  ,  n-1  _ n  /,  n-1  .  n-1 

>  I  n+1  fn+T’  271  f  n-1*’  +  |  n+1  if  n+1  ^  6  "  n  fn-ll 

lt J t i  It i  ltjti  \  !  2  I 


As the last term is nonnegative for n > 2, and

$\lfloor \frac{n+1}{2} \rfloor \lceil \frac{n+1}{2} \rceil \le \frac{(n+1)^2}{4}$,    $\lceil \frac{n-1}{2} \rceil \le \frac{n}{2}$,

inequality (11) implies (9). ||

The value of the righthand side in (9) is 2.5 for n = 20, and it
approaches the constant $4 \ln 2 \approx 2.77$ as n goes to infinity. Thus for the
problems for which d = d*, the bound on $z_C/z_F$ is about 1/2.7 of the bound on